BUG: Allow numeric ExtensionDtypes in DataFrame.select_dtypes #38246

Merged: 36 commits into pandas-dev:master on Dec 14, 2020

Member

@arw2019 arw2019 commented Dec 2, 2020

Picking up #35341 (original PR addressed all comments - here I merged master + added whatsnew note)

cc @simonjayhawkins @jreback @jbrockmendel

@arw2019 arw2019 changed the title from "ENH: Allow numeric ExtensionDtypes in DataFrame.select_dtypes" to "BUG: Allow numeric ExtensionDtypes in DataFrame.select_dtypes" Dec 2, 2020
        if issubclass(
            unique_dtype.type, tuple(dtypes_set)  # type: ignore[arg-type]
        )
        if np_issubclass_compat(unique_dtype, dtypes_set)
Contributor

I don't think we need to create yet another method here. Is there a reason you cannot just use something like

return issubclass(unique_dtype.type, tuple(dtypes_set)) or (
    np.number in dtypes_set and is_numeric_dtype(unique_dtype)
)

Member Author

No reason. Done

Member Author

Having now tried this, I think the reason might be mypy:

pandas/core/frame.py:3715: error: Argument 1 to "tuple" has incompatible type "FrozenSet[Union[ExtensionDtype, str, Any, Type[str], Type[float], Type[int], Type[complex], Type[bool], Type[object]]]"; expected "Iterable[Union[type, Tuple[Any, ...]]]"  [arg-type]

Member Author

Reverted to having a separate method

Contributor

Just ignore mypy here; the separate method makes grokking the code way more complicated.

Member Author

ok!
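
For readers following the mypy discussion in this thread: `issubclass` expects a class or a tuple of classes as its second argument, while the set's static element type is a wider union, which is what the quoted error reports. The snippet below is only a minimal stdlib sketch of the "ignore mypy here" approach. The names `DtypeLike` and `matches_any` are hypothetical and not part of pandas, and whether mypy reports the same error for this simplified union is an assumption.

```python
from typing import Any, FrozenSet, Union

# Hypothetical alias, loosely modelled on the union shown in the mypy error.
DtypeLike = Union[str, Any, type]


def matches_any(candidate: type, dtypes_set: FrozenSet[DtypeLike]) -> bool:
    """Return True if `candidate` is a subclass of any class in `dtypes_set`."""
    # issubclass() wants a tuple of classes, but the declared element type of
    # the set is wider than `type`, so the call is annotated to silence mypy.
    return issubclass(candidate, tuple(dtypes_set))  # type: ignore[arg-type]


# bool is a subclass of int, so this is True.
print(matches_any(bool, frozenset({int, float})))
```

The ignore comment only affects static checking. At runtime the call still raises `TypeError` if the set contains something that is not a class, such as a string.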

@jreback jreback added the Dtype Conversions (Unexpected or buggy dtype conversions) and ExtensionArray (Extending pandas with custom dtypes or arrays) labels Dec 3, 2020
        )
        or (
            np.number in dtypes_set
            and hasattr(unique_dtype, "_is_numeric")  # is an extensionarray
Contributor

Can you check is_extension_array_dtype here instead?

Member Author

Done

@arw2019 arw2019 closed this Dec 14, 2020
@arw2019 arw2019 reopened this Dec 14, 2020
Member Author

arw2019 commented Dec 14, 2020

Green + addressed comments

        or (
            np.number in dtypes_set
            and is_extension_array_dtype(unique_dtype)
            and unique_dtype._is_numeric
Contributor

Use getattr(unique_dtype, '_is_numeric', False), as we don't actually require this attribute on an EA type (maybe we should, though).

Member Author

Done
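
The point of the `getattr` suggestion in this thread is that a dtype without an `_is_numeric` attribute should simply not be selected, rather than raise `AttributeError`. Below is a rough stdlib-only sketch of that idea. `NumericDtype`, `PlainDtype`, and `is_selected` are hypothetical names, `numbers.Number` stands in for `np.number`, and the `is_extension_array_dtype` check from the diff is left out, so this is not the pandas implementation.

```python
import numbers


class NumericDtype:
    """Hypothetical extension dtype that declares itself numeric."""

    type = object  # scalar type is not a number class
    _is_numeric = True


class PlainDtype:
    """Hypothetical extension dtype that does not define _is_numeric."""

    type = object


def is_selected(unique_dtype, dtypes_set):
    """Mimic the shape of the predicate discussed in the review."""
    return issubclass(unique_dtype.type, tuple(dtypes_set)) or (
        numbers.Number in dtypes_set  # stand-in for `np.number in dtypes_set`
        # The default of False avoids AttributeError for dtypes without the flag.
        and getattr(unique_dtype, "_is_numeric", False)
    )


requested = frozenset({numbers.Number})
print(is_selected(NumericDtype(), requested))
print(is_selected(PlainDtype(), requested))
```

With a plain attribute access (`unique_dtype._is_numeric`), the second call would fail for `PlainDtype`, which is the case the reviewer is guarding against.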

is_selected = df.select_dtypes(np.number).shape == df.shape
assert is_selected == expected

# da = DummyArray([1, 2], dtype=DummyDtype(numeric=False))
Contributor

Remove commented code

Member Author

Done
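
The test in this thread checks whether `df.select_dtypes(np.number)` keeps every column of a frame built from a custom dtype. A small user-level sketch of the same call follows. It uses the built-in nullable `Int64` dtype as a stand-in for the test's `DummyDtype`, which is an assumption made for illustration, and `Int64` may well match through the `issubclass` branch rather than the `_is_numeric` branch.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "a": pd.array([1, 2], dtype="Int64"),  # nullable extension dtype
        "b": [1.0, 2.0],  # plain NumPy float64 column
        "c": ["x", "y"],  # string column, not numeric
    }
)

# Keep only the columns whose dtype counts as numeric.
selected = df.select_dtypes(np.number)
selected_columns = list(selected.columns)
print(selected_columns)
```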

Member Author

arw2019 commented Dec 14, 2020

Green + addressed comments

@jreback jreback added this to the 1.3 milestone Dec 14, 2020
@jreback jreback merged commit deaf138 into pandas-dev:master Dec 14, 2020
Contributor

jreback commented Dec 14, 2020

thanks @arw2019

Labels
Dtype Conversions: Unexpected or buggy dtype conversions
ExtensionArray: Extending pandas with custom dtypes or arrays
Development

Successfully merging this pull request may close these issues.

ENH: Select numeric ExtensionDtypes with DataFrame.select_dtypes
4 participants